Among the different forms of clean energy, solar energy has attracted a lot of attention because it is
both sustainable and renewable, meaning that we will never run out of it. The potential of this form of
renewable energy in a region, however, depends on its accessibility. Because the number of meteorological
stations in Iran where global solar radiation (GSR) is recorded is limited, we developed four distinct
models based on artificial intelligence to predict GSR in Tehran province, Iran. The polynomial and radial
basis function (RBF) were applied as the kernel functions of Support Vector Regression (SVR);
meteorological data obtained from the only station in the studied region were selected as the inputs of
the models, and GSR was chosen as the output. Instead of minimizing the observed training error,
SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized
performance. The experimental results show that an improvement in predictive accuracy and capability
of generalization can be achieved by the proposed approach. The calculated root mean square error and
correlation coefficient showed that SVR_rbf performed well in predicting GSR. Comparing the SVR_rbf
results with those of SVR_poly, ANFIS, and ANN reveals that SVR_rbf outperforms the SVR_poly model in
terms of prediction accuracy.
1. Introduction

For decades, urban air quality has been closely linked to the use of
fossil fuels such as oil, gas, and coal. In other words, the production
and consumption of fossil fuels for energy pose significant
environmental challenges. Other sources of energy comprise "clean or
renewable energy," such as bioenergy, geothermal energy,
run-of-the-river hydropower, wind power and solar energy. Our energy
choices affect the air we breathe and the global atmosphere. Policy
approaches must align energy and environmental issues to ensure that
economic growth and environmental protection are achieved
simultaneously. The challenge for administrators is to maximize the
benefits gained from energy consumption while minimizing the costs
incurred.

Among the different forms of renewable energy, solar energy has
attracted a great deal of attention because it is both sustainable and
renewable, which means that we will never run out of it [1-3]. It is
about as natural a source of power as can be used to generate
electricity. Solar energy systems also require little maintenance: once
the solar panels have been installed and are working at maximum
efficiency, only a small amount of maintenance is required each year to
ensure they remain in working order [4-8]. The potential of clean
energy in a region depends on its accessibility. The yearly average
solar radiation in Tehran province, a province in the center of Iran
that includes the capital, is about 4.92 kWh/m² day [9]. This abundance
of solar energy helps energy policy makers to develop solar energy
systems (solar power plants and solar heating systems), which are
attractive alternatives to traditional power plants that burn fossil
fuels such as oil and coal. In addition, agricultural systems such as
greenhouses, chicken farms, and the dairy industry are among the
largest consumers of heating energy. Installing solar energy systems to
supply the required heat or electricity can help to achieve economic
efficiency and reduce the emission of greenhouse gases.

To design and implement any solar power system, accurate and detailed
long-term knowledge of the available global solar radiation (GSR) data,
in various forms depending on the application, is needed [10]. In Iran,
the number of meteorological stations where GSR is recorded is limited
[11]. Moreover, even at these stations, there may be many days when GSR
data are missing or lie outside the expected range. In addition,
because measuring GSR at meteorological stations is difficult and time
consuming, analysing GSR and the meteorological field by experimental
techniques alone is impractical. Thus, soft computing techniques
(Artificial Neural Network, Fuzzy Logic, Adaptive-Network-Based Fuzzy
Inference System, etc.) can be used as powerful tools to analyze and
predict GSR. Several researchers around the world have used these
techniques to estimate GSR as a function of meteorological data, and
many predictive methods have been developed. Jiang [12] developed an
ANN model for estimating the monthly mean daily global solar radiation
of 8 typical cities in China. The ANN estimates were found to be in
good agreement with the measured values and superior to those of other
available empirical models. In a comprehensive study in Andalusia
(Spain), Linares-Rodriguez et al. (2011) presented a GSR prediction
model with a high correlation coefficient based on four meteorological
variables: total cloud cover, skin temperature, total column water
vapor and total column ozone, recorded over nine years at 83 ground
stations spread over the region. They used these data as the inputs of
an ANN and GSR as the output [13]. Voyant et al. [14] tested three
single methodologies, namely multi-layer perceptron (MLP),
auto-regressive and moving average (ARMA), and persistence models, in
order to forecast GSR. They concluded that the hybridization of the
three predictors (ARMA, MLP and persistence) produced better results.
In Saudi Arabia, Benghanem and Mellit [15] applied a Radial Basis
Function (RBF) network for modeling and predicting GSR. They found that
sunshine duration and air temperature were the most important input
parameters of the RBF models for predicting GSR. Wu et al. [16]
proposed a genetic approach combining a multi-model framework for solar
radiation time series prediction. Mostafavi et al. [17] developed new
prediction equations for GSR from monthly data using an integrated
search method of genetic programming (GP) and simulated annealing (SA),
called GP/SA. These models are a step towards advancing renewable
energy plants. For example, Hernandez et al. (2012) and Laidi et al.
(2013) used GSR as one of the inputs of a model for predicting the
coefficient of performance (COP) of a solar intermittent refrigeration
system for ice production [18,19]. Thus, embedding a GSR prediction
model in such models is useful for future environmental planning.

This literature review shows the valuable application of artificial
intelligence in the field of solar radiation modeling. It was therefore
decided to explore the possibility of using these techniques to develop
a unified correlation for predicting GSR in Tehran province. The
support vector (SV) method, which analyzes data and recognizes patterns
and is used for classification and regression analysis, provides a
universal new tool for solving multi-dimensional function estimation
problems [20]. Some researchers have used support vector machines (SVM)
and support vector regression (SVR) to develop GSR prediction models.
Zeng and Qiao [10] proposed a least-square (LS) support vector machine
(SVM)-based model for short-term solar power prediction (SPP) in the
USA. The inputs of the model were historical data on atmospheric
transmissivity, sky cover, relative humidity, and wind speed. The
output of the model was the predicted atmospheric transmissivity, which
was then converted to solar power according to the latitude of the site
and the time of day. Their results demonstrated that the proposed model
not only significantly outperformed a reference autoregressive (AR)
model but also achieved better results than a radial basis function
neural network (RBFNN)-based model in terms of prediction accuracy. In
other studies, the feasibility of SVMs for estimating solar radiation
from air temperatures and sunshine duration was examined [21,22].

This paper aims to demonstrate the feasibility of applying radial basis
SVR to predict global solar radiation based on some simple
meteorological data.


2.  Material  and  methods 

2.1.  Study  area  and  data  set 

Difficulties in measuring GSR and the uncertainty of the measured data
have prompted considerable effort to develop procedures and software
for the prediction and quality assessment of these data. Such
assessments are needed to ensure that the data selected for various
applications are of the highest quality available. Unlike GSR, other
meteorological parameters are routinely recorded at a large number of
climatological stations (manned and automatic), owing to the low cost
of the recording instrumentation and the ease of data acquisition.

Tehran province, with an area of 730 square km, was selected as the
study area. This province is located at 35°N latitude and 51°E
longitude, in the north central part of Iran. Measured daily data for a
seven-year period (1994 to 2000) were collected from the Islamic
Republic of Iran Meteorological Office data center [11]. This is the
only station in Tehran province. The yearly average solar radiation in
the studied region is 4.92 kWh/m² day [9]. The monthly mean daily
temperature ranged from a minimum of -1.5 °C in January to a maximum of
33.9 °C in July [11]. The area receives sufficient bright sunshine
hours throughout the year; the average is about 8-9 h per day. Yearly
average sunshine durations for the selected station are depicted in
Fig. 1.

Fig. 1. Mean sunshine duration for selected station.

The maximum and minimum temperature, the actual duration of sunshine
(n, hour), the daylight hours (N, hour), i.e. the maximum possible
duration of sunshine, the day of the year, numbered from 1 (January
1st) to 365 or 366 (December 31st), the clear-sky solar radiation
(Rso, MJ/m² day) and the extraterrestrial radiation (Ra, MJ/m² day) are
some of the most important daily parameters that affect solar
radiation. In this study, these parameters were selected as the inputs
of the developed models to find the best relation between them and
solar radiation (the output of the models). The extraterrestrial solar
radiation, i.e. the solar radiation received at the top of the earth's
atmosphere on a horizontal surface, can be estimated for each day of
the year and for different latitudes by the following equation [23]:

$$R_a = \frac{24 \times 60}{\pi} \left[ \omega_s \sin(\varphi) \sin(\delta) + \cos(\varphi) \cos(\delta) \sin(\omega_s) \right] G_{sc} \, d_r \qquad (1)$$

where $R_a$ is the extraterrestrial radiation (MJ/m² day), $\omega_s$
is the sunset hour angle (rad), $\varphi$ is the latitude (rad), which
is positive in the northern hemisphere and negative in the southern
hemisphere, $\delta$ is the solar declination angle (rad), $G_{sc}$ is
the solar constant, equal to 0.0820 MJ/m² min, and $d_r$ is the inverse
relative earth-sun distance (dimensionless). The expressions for $d_r$,
$\delta$, and $\omega_s$ can be found in reference [24].

$R_{so}$ is the fraction of the extraterrestrial radiation that reaches
the earth's surface on clear-sky days ($n = N$). This parameter can be
calculated as:

$$R_{so} = \left( 0.75 + 2 \times 10^{-5} z \right) R_a \qquad (2)$$

where $z$ is the station elevation above sea level (m) [23]. The
training and validation data sets were selected by randomization of the
input data.
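
As an illustration of Eqs. (1) and (2), the following Python sketch computes Ra and Rso for a given latitude, day of the year and elevation. The text does not give the expressions for d_r, δ and ω_s (it refers to [24]), so the commonly used FAO-56-style expressions are assumed here; the function names and the example elevation are hypothetical and are not taken from the study.

```python
import math

GSC = 0.0820  # solar constant in MJ m^-2 min^-1, as stated in the text


def extraterrestrial_radiation(lat_rad, day_of_year):
    """Eq. (1): daily extraterrestrial radiation Ra in MJ m^-2 day^-1."""
    # Assumed auxiliary expressions (FAO-56 style); not given in the paper.
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    ws = math.acos(-math.tan(lat_rad) * math.tan(delta))
    bracket = (ws * math.sin(lat_rad) * math.sin(delta)
               + math.cos(lat_rad) * math.cos(delta) * math.sin(ws))
    return (24.0 * 60.0 / math.pi) * bracket * GSC * dr


def clear_sky_radiation(ra, elevation_m):
    """Eq. (2): clear-sky radiation Rso from Ra and station elevation z (m)."""
    return (0.75 + 2e-5 * elevation_m) * ra


# Example call: 35 deg N on 1 July (day 182), illustrative elevation of 1200 m.
ra_example = extraterrestrial_radiation(math.radians(35.0), 182)
rso_example = clear_sky_radiation(ra_example, 1200.0)
print(ra_example, rso_example)
```

The arccosine used for ω_s is only defined away from the polar regions, which is sufficient for the latitude considered here.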

2.2.  Supervised  machine  learning 

SVMs are a supervised machine learning technique belonging to the
family of generalized linear classifiers. Their formulation embodies
the structural risk minimization (SRM) principle, in contrast to the
empirical risk minimization (ERM) approach that is widely employed in
statistical learning methods. SRM minimizes an upper bound on the
generalization error, whereas ERM minimizes the error on the training
data. It is this difference that gives SVMs a greater potential to
generalize. Furthermore, classical neural network models may become
trapped in a locally optimal solution, whereas a globally optimal
solution is assured for SVMs. SVMs can be used for both classification
and regression problems.


2.2.1. Feature space and kernel functions

The fundamental working principle of SVMs is to map the data into
another dot product space (called the feature space) through a
non-linear mapping and to run the linear algorithm in that feature
space. Because the feature space is typically high-dimensional,
evaluating dot products in it directly demands considerable
computational resources and time. In some cases, however, a less
complex kernel can be formulated and its efficiency assessed. Complex
real-world problems need a more expressive hypothesis space than that
of linear functions, because linear learning machines have limited
computational power. In other words, the target cannot be expressed as
a simple linear combination of the given attributes. One significant
property of linear learning machines is that they can be expressed in a
dual representation. This means that the hypothesis can be expressed as
a linear combination of the training points, so that the decision rule
can be evaluated using only the inner products between the test point
and the training points. If the inner product in feature space can be
computed directly as a function of the original input points, a
non-linear learning machine can be built. Such a direct computation
method is called a kernel function and is denoted by K. In other words,
a kernel is a function K such that for all x, z ∈ X,

$$K(x, z) = \langle \phi(x), \phi(z) \rangle \qquad (3)$$

A kernel function must satisfy two basic conditions: (1) the function
has to be symmetric, i.e.

$$K(x, z) = \langle \phi(x), \phi(z) \rangle = \langle \phi(z), \phi(x) \rangle = K(z, x) \qquad (4)$$

and (2) it must satisfy the Cauchy-Schwarz inequality:

$$K(x, z)^2 = \langle \phi(x), \phi(z) \rangle^2 \le \| \phi(x) \|^2 \, \| \phi(z) \|^2 \qquad (5)$$

These conditions are necessary but not sufficient to guarantee the
existence of a feature space for a given kernel function. Nevertheless,
once characterized, kernel representations provide an alternative
solution by projecting the data into a high-dimensional feature space
to enhance the computational capability of linear learning machines. Of
the many kernel functions available for developing a model, non-linear
kernel functions are likely to be more effective in analysing the
intricate relations found in real-world problems and are therefore
adopted in this work. This study uses an SVM learning approach with an
RBF kernel to construct a model that relates global solar radiation
(the output) to the following inputs: maximum and minimum temperature,
actual duration of sunshine, daylight hours, clear-sky solar radiation
and extraterrestrial radiation. These input parameters are readily
available at most meteorological stations and can be helpful for
estimating GSR.
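
The two necessary conditions in Eqs. (4) and (5) can be checked numerically on a finite set of points. The sketch below is only an illustration: it uses the identity ||φ(x)||² = K(x, x) to express the Cauchy-Schwarz bound through kernel values alone, and it assumes a Gaussian RBF kernel of the form exp(-||x - z||²/σ²). The function names and the sample points are hypothetical.

```python
import math
import random


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF kernel (assumed form): exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)


def satisfies_necessary_conditions(kernel, points, tol=1e-12):
    """Check symmetry (Eq. 4) and Cauchy-Schwarz (Eq. 5) on all point pairs."""
    for x in points:
        for z in points:
            k_xz = kernel(x, z)
            if abs(k_xz - kernel(z, x)) > tol:
                return False  # not symmetric
            if k_xz ** 2 > kernel(x, x) * kernel(z, z) + tol:
                return False  # violates K(x,z)^2 <= K(x,x) K(z,z)
    return True


rng = random.Random(0)
sample = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
print(satisfies_necessary_conditions(rbf_kernel, sample))
```

Passing this check does not prove that a function is a valid kernel, in line with the remark that the conditions are necessary but not sufficient; failing it does rule the function out.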

2.2.2.  Radial  basis  function  as  kernel 

The flexibility of the SVM is attributed to the use of kernel functions
that implicitly map the data to a higher-dimensional feature space. A
linear solution in the higher-dimensional feature space corresponds to
a non-linear solution in the original, lower-dimensional input space.
This makes the SVM a feasible choice for addressing various problems
that are naturally non-linear [25]. Several accessible methods apply
SVMs with non-linear kernels to regression problems [26]. One such
method is the least-squares SVM (LS-SVM) with a radial basis function
(RBF) kernel. The main benefit of LS-SVM is that it is computationally
more efficient than the standard SVM method, since LS-SVM training
needs only the solution of a set of linear equations instead of the
lengthy and computationally demanding quadratic programming problem
entailed in the standard SVM. Compared with other candidate kernel
functions, the RBF is a more compactly supported kernel, which makes it
well suited to limiting the computational cost of training and
improving the generalization performance of LS-SVM, an attribute of
great value in model design. Therefore, the RBF kernel, with the
parameter σ, is adopted in this study.


2.2.3.  Support  vector  regression  (SVR) 

ε-support vector regression (ε-SVR) was put forth as an alternative
based on the ε-insensitive loss function. The objective of SVR is to
find a function that deviates by at most ε from the actual target
values for all the training data and that is at the same time as flat
as possible. The non-linear form of support vector regression is
obtained through the kernel function. Apart from the kernel-specific
parameters, SVR requires only a few user-defined parameters: the
optimal values of the regularization parameter C and of the size of the
error-insensitive zone ε need to be determined. The choice of these
settings controls the complexity of the prediction. One of the major
advantages of SVR is that the algorithm involves solving a quadratic
programming problem, which leads to a unique, optimal and global
solution.
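
To make the roles of ε and C concrete, the sketch below evaluates the ε-insensitive loss and the standard ε-SVR primal objective, 0.5·||w||² + C·Σ loss, for a given weight vector and set of residuals. This is the textbook formulation and is assumed here for illustration; the paper does not write the objective out, and the function names are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess deviation outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(weights, residuals, c, epsilon):
    """Assumed standard epsilon-SVR objective: flatness term + C * total loss."""
    flatness = 0.5 * sum(w * w for w in weights)
    total_loss = sum(epsilon_insensitive_loss(r, 0.0, epsilon) for r in residuals)
    return flatness + c * total_loss


# A residual smaller than epsilon costs nothing; a larger one is penalized.
print(epsilon_insensitive_loss(10.0, 10.05, 0.1))
print(epsilon_insensitive_loss(10.0, 10.5, 0.1))
```

A larger C penalizes deviations outside the tube more heavily, while a larger ε widens the tube and tolerates more error, which is why both settings control the complexity of the prediction.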

In SVR, $\{(x_i, y_i)\}_{i=1}^{N}$ is considered as a training set, in
which $x_i \in \mathbb{R}^p$ represents a p-dimensional input vector
and $y_i \in \mathbb{R}$ is a scalar measured output that denotes the
system output. The goal is to develop a function $y = f(x)$ which
represents the dependence of the output $y_i$ on the input $x_i$. The
form of this function is:

$$y = w^T \phi(x) + b \qquad (6)$$

where $w$ is known as the weight vector and $b$ the bias. This
regression model can be constructed using a non-linear mapping function
$\phi(\cdot)$. By mapping the original input data onto a
high-dimensional space, the non-linearly separable problem becomes
linearly separable in that space. The function
$\phi(\cdot): \mathbb{R}^p \to \mathbb{R}^h$ is a mostly non-linear
function which maps the data into a higher, possibly infinite,
dimensional feature space. The main difference from the standard SVM is
that the LS-SVM involves equality constraints instead of inequality
constraints and works with a least squares cost function. The
optimization problem and the equality constraints are given by the
following equations:


$$\min_{w, e} J(w, e) = \frac{1}{2} w^T w + \gamma \frac{1}{2} \sum_{i=1}^{N} e_i^2 \qquad (7)$$

subject to

$$y_i = w^T \phi(x_i) + b + e_i, \quad i = 1, \ldots, N \qquad (8)$$


where $e_i$ is the random error and $\gamma \in \mathbb{R}^+$ is a
regularization parameter that controls the trade-off between minimizing
the training errors and the complexity of the model. The objective is
now to find the optimal parameters that minimize the prediction error
of the regression model. The optimal model is selected by minimizing
the cost function, in which the errors $e_i$ are minimized. This
formulation corresponds to regression in the feature space and, because
the dimension of the feature space is high, possibly even infinite, the
problem is not easy to solve directly. To address this, the following
Lagrange function is introduced:


$$L(w, b, e; \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left\{ w^T \phi(x_i) + b + e_i - y_i \right\} \qquad (9)$$


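The solution of Eq. (9) is obtained by setting the partial derivatives
with respect to $w$, $b$, $e_i$ and $\alpha_i$ to zero, i.e.

$$\frac{\partial L}{\partial w} = 0 \rightarrow w = \sum_{i=1}^{N} \alpha_i \phi(x_i) \qquad (10)$$

$$\frac{\partial L}{\partial b} = 0 \rightarrow \sum_{i=1}^{N} \alpha_i = 0 \qquad (11)$$

$$\frac{\partial L}{\partial e_i} = 0 \rightarrow \alpha_i = \gamma e_i, \quad i = 1, \ldots, N \qquad (12)$$

$$\frac{\partial L}{\partial \alpha_i} = 0 \rightarrow w^T \phi(x_i) + b + e_i - y_i = 0, \quad i = 1, \ldots, N \qquad (13)$$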

Finally, the estimated values of $b$ and $\alpha_i$, i.e. $\hat{b}$ and
$\hat{\alpha}_i$, can be obtained by solving the linear system, and the
resulting LS-SVM model can be expressed as follows:

$$y = f(x) = \sum_{i=1}^{N} \hat{\alpha}_i K(x, x_i) + \hat{b} \qquad (14)$$

where $K(x, x_i)$ is a kernel function. In this case, the non-linear
RBF kernel is defined as:

$$K(x, x_i) = \exp\left( -\frac{\| x - x_i \|^2}{\sigma^2} \right) \qquad (15)$$

where $\sigma$ is the kernel function parameter of the RBF kernel. The
regularization parameter $\gamma$ is also important in the LS-SVM
model, as it determines the trade-off between minimizing the fitting
error and the smoothness of the estimated function. It is not known in
advance which values of $\gamma$ and $\sigma$ are best for a particular
application in order to achieve the maximum performance with LS-SVM
models. The kernel parameter therefore has to be tuned during the
calibration of the model. A prediction model based on support vector
regression (SVR) is proposed in this paper as a way to predict solar
radiation. To develop an effective SVR model, the SVR parameters must
be set with care. SVR seeks to minimize the generalization error,
rather than the training error, in order to achieve generalized
performance.
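
The linear system mentioned above follows from eliminating w and e_i from the optimality conditions: for each i, Σ_j α_j K(x_i, x_j) + b + α_i/γ = y_i, together with Σ_i α_i = 0. The sketch below solves this system with plain Gaussian elimination and evaluates the resulting model. It is a minimal illustration on toy data, not the implementation used in the study; the RBF form exp(-||x - x_i||²/σ²), the function names and the toy values are assumptions.

```python
import math


def rbf_kernel(x, z, sigma):
    """Assumed RBF form: exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def lssvm_fit(xs, ys, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T."""
    n = len(xs)
    matrix = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k_ij = rbf_kernel(xs[i], xs[j], sigma)
            row.append(k_ij + (1.0 / gamma if i == j else 0.0))
        matrix.append(row)
    solution = solve_linear_system(matrix, [0.0] + list(ys))
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(x, xs, alphas, b, sigma):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alphas, xs)) + b


# Toy one-dimensional example (illustrative values only).
toy_x = [[0.0], [1.0], [2.0], [3.0]]
toy_y = [0.0, 1.0, 4.0, 9.0]
bias, multipliers = lssvm_fit(toy_x, toy_y, gamma=10.0, sigma=1.5)
print(lssvm_predict([1.5], toy_x, multipliers, bias, sigma=1.5))
```

Because training reduces to one linear solve, the cost grows with the cube of the number of training points when a dense solver like this one is used; larger data sets would call for a library routine.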


2.3.  Artificial  neural  network 

In order to evaluate the suggested model, an artificial neural network
(ANN) was developed and the results obtained from the ANN were compared
with the results obtained from the SVM [27]. To train the ANNs, seven
independent variables were considered as inputs and solar radiation was
selected as the only output. The network was trained using 75% of the
original 3707 records, and the remaining 25% was used as a test
dataset. Matlab software and its ANN toolbox were used to develop
feed-forward neural networks with one, two and three hidden layers.
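
The 75%/25% random partition described above can be sketched as follows. The text does not specify how the randomization was carried out, so the fixed seed, the rounding rule and the function name are assumptions made for reproducibility of the illustration only.

```python
import random


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Shuffle sample indices and cut them into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# With 3707 records, a 75% share corresponds to 2780 training records.
records = list(range(3707))
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))
```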

The input layer corresponded to the seven selected input parameters.
The output layer corresponded to the single output, i.e., the measured
solar radiation. These networks were trained using the
Levenberg-Marquardt training algorithm; a network trained in this way
is termed a Levenberg-Marquardt neural network (LMNN).


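Table 1
Performance criteria.

| Criteria | Calculation | Eq. |
| --- | --- | --- |
| Root mean squared error (RMSE) | $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (d_i - y_i)^2}$ | (16) |
| Correlation coefficient (R) | $R = \frac{\sum_{i=1}^{n} (d_i - \bar{d})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (d_i - \bar{d})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$ | (17) |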
Table 2
User-defined parameters for SVR_rbf, SVR_poly.

| Model | User-defined parameters |
| --- | --- |
| Support vector regression, RBF kernel | C = 100, γ = 0.3, ε = 0.001 |
| Support vector regression, polynomial kernel | C = 100, d = 1, ε = 0.001 |
| ANFIS | Number of rules: 10, membership function: generalized bell, number of iterations: 1000, identification method: grid partitioning |
| ANN | Learning rate = 0.2, momentum = 0.1, hidden nodes = 7, 12, number of iterations = 1500 |


[Figure 2: panels "Train Data" and "Test Data"; axis label "Actual Values".]

Fig. 2. Plot of observed and predicted solar radiation with the original data set
using SVR_rbf and SVR_poly models during training (a) and testing (b).

2.4.  Performance  metrics 

To evaluate the performance of the SVR_rbf and SVR_poly models, several
measures were used to confirm the validity of the proposed SVR models
and of the ANN. The root mean squared error (RMSE) and the correlation
coefficient (R) served to evaluate the differences between the expected
and actual values. These criteria are calculated as indicated in
Table 1, where $n$ is the total number of test data, $d_i$ is the
experimental value, $y_i$ is the forecast value, $\bar{d}$ is the
averaged experimental value and $\bar{y}$ is the averaged forecast
value.
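
The two criteria of Table 1 can be computed directly from paired lists of observed and forecast values. The sketch below is an illustration only; the function names and the small example series are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed d_i and forecast y_i."""
    n = len(observed)
    return math.sqrt(sum((d - y) ** 2 for d, y in zip(observed, predicted)) / n)


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient R between observed and forecast values."""
    n = len(observed)
    d_mean = sum(observed) / n
    y_mean = sum(predicted) / n
    covariance = sum((d - d_mean) * (y - y_mean)
                     for d, y in zip(observed, predicted))
    d_spread = sum((d - d_mean) ** 2 for d in observed)
    y_spread = sum((y - y_mean) ** 2 for y in predicted)
    return covariance / math.sqrt(d_spread * y_spread)


# Illustrative values only.
observed_example = [18.2, 21.5, 25.1, 27.9]
forecast_example = [17.6, 22.3, 24.0, 28.4]
print(rmse(observed_example, forecast_example))
print(correlation_coefficient(observed_example, forecast_example))
```

RMSE is expressed in the units of the predicted quantity, whereas R is dimensionless, so the two criteria capture different aspects of agreement: a model can track the trend of the observations closely (high R) and still carry a systematic offset (high RMSE).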


3.  Results  and  discussions 

The RBF was applied as the kernel function for the prediction of solar
radiation in this study. The three parameters associated with RBF
kernels are C, ε and γ. The accuracy of an SVM model depends
principally on the selection of the model parameters. In our scheme, a
default value of ε = 0.1 seemed to perform well. To select the
user-defined parameters (i.e. C, d and γ), a large number of trials
were carried out with different combinations of C and d for polynomial
kernels and of C and γ for radial basis function kernels. Table 2
provides the optimal values of the user-defined parameters for this
dataset with polynomial and RBF kernel-based SVR. For a fair comparison
of the results obtained with the RBF and polynomial kernels, the same
value of ε was used for both.

Table 3
Performance indices of various approaches.

| Method | Training error (RMSE) | Training correlation coefficient (R) | Testing error (RMSE) | Testing correlation coefficient (R) |
| --- | --- | --- | --- | --- |
| SVR_rbf | 3.2 | 0.900 | 3.3 | 0.889 |
| SVR_poly | 3.5 | 0.883 | 3.4 | 0.887 |
| ANFIS | 3.7 | 0.897 | 3.8 | 0.899 |
| ANN | 3.8 | 0.895 | 3.7 | 0.894 |

To evaluate the performance of the SVR models, the observed solar
radiation was plotted against the predicted values. Fig. 2(a)
illustrates the agreement between observed and predicted data in the
training phase, while Fig. 2(b) shows the results for the testing
phase. As seen from Fig. 2, SVR_rbf performed well in predicting GSR.
Comparing the SVR_rbf results with those of SVR_poly, ANFIS, and ANN
reveals that SVR_rbf outperforms the SVR_poly model in terms of
prediction accuracy.

To evaluate the performance of the proposed method, experiments were
conducted to determine the relative significance of each independent
parameter (SVR input) for the solar radiation (output). The root mean
squared error (RMSE) and correlation coefficient (R) served to evaluate
the differences between the expected and actual values for SVR_poly and
SVR_rbf. Table 3 compares the SVR_rbf and SVR_poly models with ANFIS
and ANN. The results in Table 3 show that the proposed SVR_rbf model
predicts solar radiation with the lowest RMSE of the compared models in
both the training and testing phases.

As discussed above, the SVR models are the best models for predicting
the solar potential in Tehran province. Determining the GSR
distribution is arguably the most important step in designing and
selecting solar systems, not only for reducing their high initial cost
but also for increasing the energy collected. At present, such systems
are not economically viable in agricultural buildings such as
greenhouses unless the carbon trading option is taken into account.
Therefore, government support through financial investment and
subsidies is an effective way of extending these systems in the
agricultural sector [24].

To make sure that the developed models can be generalized to the whole
of Tehran Province, the best model (SVR-rbf) was used to simulate a new
data set obtained from Karaj station in 2008. This station is located
to the west of Tehran at 35°48'N latitude and 51°00'E longitude. The
results showed that the model successfully predicted GSR using the new
data set from another station. The correlation coefficient between
observed and predicted GSR was 0.93 (Fig. 3(a)). As can be observed in
Fig. 3(b), the predicted values follow the actual ones with high
accuracy and minimal error. Accordingly, it can be concluded that the
developed model can be used throughout Tehran Province to forecast GSR
based on meteorological data.


4.  Conclusions 

This paper presents a support vector regression (SVR) technique for
building a model to predict global solar radiation (GSR). The model
offers a technical means of examining the suitability of energy supply
plants and the latent renewable energy potential of a region, as
supporting material for urban energy supply planning at the draft plan
development stage. One of the main characteristics of the SVR technique
in this model is that, instead of minimizing the observed training
error, SVR attempts to minimize the generalization error bound so as to
achieve generalized performance. Two SVRs were investigated: the first
uses a radial basis function kernel (SVR-rbf) and the second a
polynomial kernel (SVR-poly). The results showed that SVR-rbf is better
than SVR-poly in predicting GSR. The SVR approaches were also compared
with the results provided by ANN and ANFIS, and showed improvements in
the prediction system. Both SVR techniques are better than ANN and
ANFIS in terms of root mean square error; in particular, the estimates
produced by SVR have smaller errors than those of the ANNs. Moreover,
SVR takes less computer time than ANN and ANFIS in both cases. From the
results it can be concluded that the SVR method can predict GSR with
higher estimation accuracy and shorter computation time.

The experimental results show that an improvement in predictive
accuracy and capability of generalization can be achieved by our
proposed approach. The results show that SVR can serve as a promising
alternative to existing prediction models. The experiments also show
that the prediction model overcomes the main shortcomings of artificial
neural networks: it does not require a network structure to be defined
and it does not become trapped in local optima.